Final Project: Data visualisation and analysis of social responsibilities of global warming in terms of CO2 emissions and average temperature

by Kathrine Schultz-Nielsen (s183929), David Ribberholt Ipsen (s164522), Hanlu He (s183909)


Course: 02806 - Social data analysis and visualization spring 2022
Course responsible: Sune Lehmann Jørgensen
DTU - Technical University of Denmark

1. Introduction and Motivation

Global warming is no longer a problem we can look the other way and threat on the earth’s ecosystem, subsequently our survival increases as we speak. CO2 emission is what most considers as the primary driver for climate change for its impact on increasing average temperature of the earth. The amount of atmospheric concentration of carbon dioxide has increased by 50% since the industrilisation era due to human activities [1]. With increasing living standards over the past decades were a result of us exploiting non-renewable enegry sources such as coal, oil and natural gas, lack of awareness of sustainable infrastructure design in urbanization processes and rapid population growth [2]. And we are already facing many of the consequences of global warming such as ice sheets melting, more frequent forest wildfires, rising sea level, more intense heat waves and appearence of more extreme temperatures [3]. These changes are esimtated to be irreversible even 1000 years after if we manage to stop CO2 emissions [4]. We have slowly came to realise the seriousenss of the issue and began drawing out plans to slow down the rising temperature. The paris agreement in 2015 is the largest agreement that binded 196 countries in the world to take actions against global warming, with the ambitious goal of limiting temperature increase to 1.5 degree celcius [5]. In this project we will be presenting the global warming trend with CO2 emissions and global recorded temperature data, assessing contribution of various social factors to CO2 emissions, dig deeper into nation-wise responsibilities for CO2 emissions and lastly looking into the future of temperature rise based on current climate change mitigation policies.


We will be working primarily with two datasets for this project, the Our World in Data CO2 and Greenhouse Gas Emissions database, Climate watch emissions by sector database and the Climate Change: Earth Surface Temperature Data. The reason for choosing the two datasets is because to assess the impact of global warming interms of CO2 emissions and rising temperature requires extensive information in the geographical and time domain, as the matter concerns all of us no matter where we are from and comparison of the past to the present so we can look into the future. And the two datasets do exactly that, with:

The Our Wolrd in Data CO2 and Greenhouse Gas Emission database contains 21591 records of CO2 emissions and various relevant variables such as energy consumption, gdp, population, CO2 emissions by industrial sector etc. for 243 countries and regions with the earliest record from 1750 and newest from 2020.

The Climate watch emissiosn by sector database contains data of CO2 emissions in million tonnes by sectors: Transporation, Building, Industrial Processes for each country from 2018

The Climate Change: Earth Surface Temperature Data contains a collection data files with 8 million+ records of average temperature by cities and countries around the world.

Therefore, we believe the two datasets compliments well with each other to provide us with sufficient information for our visualisations


To tell our story, we aim to find a balance between author driven and user driven narrative with interactive slide shows and annotated graphs. In this way, we are able to present the facts as well as allowing the users to make up their own opinion on the what, why and how of global warming, as considerable number of factors and nations are involved.

2. Dataset segmentation and cleaning

Before we delve into the deeper analysis of this project, let us import all the necessary libraries for this project:

# Basic libraries

import numpy as np
import pandas as pd
import json
import matplotlib.pyplot as plt
import seaborn as sns
import math
import time
import itertools

import warnings
warnings.filterwarnings('ignore')
# helper 

import pycountry_convert as pc

def country_to_continent(country_code):
    try:
        country_alpha2 = pc.country_alpha3_to_country_alpha2(country_code)
        country_continent_code = pc.country_alpha2_to_continent_code(country_alpha2)
        country_continent_name = pc.convert_continent_code_to_continent_name(country_continent_code)
    except: 
        country_continent_name = country_code
        
    return country_continent_name
# Display libraries

from IPython.display import IFrame, display
from IPython.core.display import display as disp, HTML
from ipywidgets import widgets, interact, interactive, fixed, interact_manual

from plotly import __version__
import plotly.express as px
import plotly.offline as pyo
import plotly.express as px
import plotly.io as pio
import plotly.graph_objects as go
from plotly.subplots import make_subplots

pio.renderers.default = 'notebook'

Now, we will proceed to filtering, subsetting and cleaning the original datasets

Filtering & subsetting the full dataset

df_co2 = pd.read_csv('data/owid-co2-data.csv',sep = ',',encoding = 'unicode_escape')
df_co2
iso_code country year co2 co2_per_capita trade_co2 cement_co2 cement_co2_per_capita coal_co2 coal_co2_per_capita ... ghg_excluding_lucf_per_capita methane methane_per_capita nitrous_oxide nitrous_oxide_per_capita population gdp primary_energy_consumption energy_per_capita energy_per_gdp
0 AFG Afghanistan 1949 0.015 0.002 NaN NaN NaN 0.015 0.002 ... NaN NaN NaN NaN NaN 7624058.0 NaN NaN NaN NaN
1 AFG Afghanistan 1950 0.084 0.011 NaN NaN NaN 0.021 0.003 ... NaN NaN NaN NaN NaN 7752117.0 9.421400e+09 NaN NaN NaN
2 AFG Afghanistan 1951 0.092 0.012 NaN NaN NaN 0.026 0.003 ... NaN NaN NaN NaN NaN 7840151.0 9.692280e+09 NaN NaN NaN
3 AFG Afghanistan 1952 0.092 0.012 NaN NaN NaN 0.032 0.004 ... NaN NaN NaN NaN NaN 7935996.0 1.001732e+10 NaN NaN NaN
4 AFG Afghanistan 1953 0.106 0.013 NaN NaN NaN 0.038 0.005 ... NaN NaN NaN NaN NaN 8039684.0 1.063052e+10 NaN NaN NaN
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
25186 ZWE Zimbabwe 2016 10.738 0.765 1.415 0.639 0.046 6.959 0.496 ... 2.076 11.50 0.820 6.21 0.443 14030338.0 2.096179e+10 47.5 3385.574 1.889
25187 ZWE Zimbabwe 2017 9.582 0.673 1.666 0.678 0.048 5.665 0.398 ... 2.023 11.62 0.816 6.35 0.446 14236599.0 2.194784e+10 NaN NaN NaN
25188 ZWE Zimbabwe 2018 11.854 0.821 1.308 0.697 0.048 7.101 0.492 ... 2.173 11.96 0.828 6.59 0.456 14438812.0 2.271535e+10 NaN NaN NaN
25189 ZWE Zimbabwe 2019 10.949 0.748 1.473 0.697 0.048 6.020 0.411 ... NaN NaN NaN NaN NaN 14645473.0 NaN NaN NaN NaN
25190 ZWE Zimbabwe 2020 10.531 0.709 NaN 0.697 0.047 6.257 0.421 ... NaN NaN NaN NaN NaN 14862927.0 NaN NaN NaN NaN

25191 rows × 60 columns

df_co2['continent'] = df_co2['iso_code'].apply(lambda x: country_to_continent(x))

3. Descriptive statistics about the dataset

After the full segmentation and cleaning has been performed, let us try to understand our dataset a little bit better. We will afterwards jump into the Text Analytics part for finding important top business keywords, so it is important to know how large a scale does our analysis have to deal with.

df_co2
iso_code country year co2 co2_per_capita trade_co2 cement_co2 cement_co2_per_capita coal_co2 coal_co2_per_capita ... methane methane_per_capita nitrous_oxide nitrous_oxide_per_capita population gdp primary_energy_consumption energy_per_capita energy_per_gdp continent
0 AFG Afghanistan 1949 0.015 0.002 NaN NaN NaN 0.015 0.002 ... NaN NaN NaN NaN 7624058.0 NaN NaN NaN NaN Asia
1 AFG Afghanistan 1950 0.084 0.011 NaN NaN NaN 0.021 0.003 ... NaN NaN NaN NaN 7752117.0 9.421400e+09 NaN NaN NaN Asia
2 AFG Afghanistan 1951 0.092 0.012 NaN NaN NaN 0.026 0.003 ... NaN NaN NaN NaN 7840151.0 9.692280e+09 NaN NaN NaN Asia
3 AFG Afghanistan 1952 0.092 0.012 NaN NaN NaN 0.032 0.004 ... NaN NaN NaN NaN 7935996.0 1.001732e+10 NaN NaN NaN Asia
4 AFG Afghanistan 1953 0.106 0.013 NaN NaN NaN 0.038 0.005 ... NaN NaN NaN NaN 8039684.0 1.063052e+10 NaN NaN NaN Asia
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
25186 ZWE Zimbabwe 2016 10.738 0.765 1.415 0.639 0.046 6.959 0.496 ... 11.50 0.820 6.21 0.443 14030338.0 2.096179e+10 47.5 3385.574 1.889 Africa
25187 ZWE Zimbabwe 2017 9.582 0.673 1.666 0.678 0.048 5.665 0.398 ... 11.62 0.816 6.35 0.446 14236599.0 2.194784e+10 NaN NaN NaN Africa
25188 ZWE Zimbabwe 2018 11.854 0.821 1.308 0.697 0.048 7.101 0.492 ... 11.96 0.828 6.59 0.456 14438812.0 2.271535e+10 NaN NaN NaN Africa
25189 ZWE Zimbabwe 2019 10.949 0.748 1.473 0.697 0.048 6.020 0.411 ... NaN NaN NaN NaN 14645473.0 NaN NaN NaN NaN Africa
25190 ZWE Zimbabwe 2020 10.531 0.709 NaN 0.697 0.047 6.257 0.421 ... NaN NaN NaN NaN 14862927.0 NaN NaN NaN NaN Africa

25191 rows × 61 columns


4. The apparent problem: The Earth is heating up


5. What is Causing global warming?

5.1. Population growth

It is more than apparent that the world population is constantly growing, yet the resources on earth does not grow with it. Every life that comes to join the big party has demand for food, shelter, clothing, and due to advancement in technology and living standards, the list has now become extensive with Mcdonalds to michellin star, straw roof to golden roof, cheap fast fashion to luxury clothing, iphone 13 and its dupes, scooters to ferarris etc. All that demands is shouting production, production and production! Which inevitably increases fossil fuel burning and thereby CO2 emissions. It has been found that 1% of population growth mounts to 1.28% increase in average CO2 emission and thereby, results in global warming [6]. Let’s take a look how the world’s average CO2 emissions has changed with respect to population growth over the 30 years span from 1990 to 2020.

## fill 
df_co2 = df_co2.fillna(0)
## select needed columns 
df_pop = df_co2[['iso_code', 'country', 'year', 'co2', 'co2_per_capita','population','continent']] 

## extract data for all countries from 1990 to 2020
df_pop_30 = df_pop[(df_pop['year'] > 1989) & (df_pop['year'] < 2021)]

world = df_pop_30[df_pop_30['country'] == 'World']
world['pop_change'] = world['population'].pct_change()*100
world['co2_change'] = world['co2'].pct_change()*100

df_world =pd.melt(world[['year','co2','population','co2_change','pop_change']],id_vars=['year'],var_name='CO2_pop', value_name='value')
df_world = df_world.fillna(0)

fig = go.Figure(
    data=[
        go.Scatter(name='CO2 emissions', x=sorted(list(set(df_world.year))), y=df_world[df_world['CO2_pop']=='co2'].value/1000, yaxis = 'y'),
        go.Scatter(name='Population', x=sorted(list(set(df_world.year))), y=df_world[df_world['CO2_pop']=='population'].value, yaxis ='y2')
    ],
    layout={
        'yaxis': {'title': 'Average CO2 emissions (billion t)'},
        'yaxis2': {'title': 'Population', 'overlaying': 'y', 'side': 'right'}
    }
)

# Change the bar mode
fig.update_layout(barmode='group',title = 'Change in worlds annual CO2 emissions and total population')

fig.show()

It is quite apparent that both the world’s population and annual CO2 emissions have been increasing and the perentage change ratio between them has been greater than what was said to be 1:1.28 up until 2010. Around 2015,the increase of average CO2 emissions reached a plateu and in 2020 there is a rather significant decrease, meanwhile the population continues to grow, which is reflected in both plots. The reason for the slowed increase of CO2 emissions in 2015 is most likely due to more global efforts in tackling climate change, especially china for its tightened policies to deal with severe air pollution by burning less coal [7]. And in 2020, the pandemic put the world in lockdown which either halted or reduced almost all industrial activities, hence the significant drop in average CO2 emmissions.

years = [1990,1995,2000,2005,2010,2015,2020]
df_world = df_world[df_world['year'].isin(years)]
change_ratio = np.round(df_world[df_world['CO2_pop']=='co2_change']['value'].values/df_world[df_world['CO2_pop']=='pop_change']['value'].values,3)

fig = make_subplots(rows=1, cols=2)
fig = go.Figure(data=[
    go.Bar(name='CO2 emissions', x=years, y=df_world[df_world['CO2_pop']=='co2_change'].value),
    go.Bar(name='Population', x=years, y=df_world[df_world['CO2_pop']=='pop_change'].value),
    go.Scatter(name = 'Change ratio', x = years, y = change_ratio, text = change_ratio ,textposition="top center",mode='lines+markers+text')
],
layout={
        'yaxis': {'title': 'Percentage change (%)'}
    })
# Change the bar mode
fig.update_layout(barmode='group',title='Percentage changes of annual CO2 emissions and population')
fig.show()

5.2. Energy consumption

With more people in the world and higher demand for living standards just mean that there are more mouth to feed and harder to suffice. We are consuming more and more energy as individuals. This naturally lead to higher global consumption for energy, hence we need to extract more oil, natural gas, burn more coal, and with great hope, turn more towards green energy sources like solar and wind.

df_energy = df_co2[['country', 'year', 'co2_per_unit_energy','primary_energy_consumption', 'energy_per_capita','cement_co2', 'coal_co2', 'flaring_co2', 'gas_co2', 'oil_co2',
       'other_industry_co2', 'cement_co2_per_capita', 'coal_co2_per_capita', 'flaring_co2_per_capita', 'gas_co2_per_capita', 'oil_co2_per_capita', 'other_co2_per_capita']]

## extract data for all countries from 1749 to 2019
energy = df_energy[(df_energy['year'] > 1749) & (df_energy['year'] < 2021)]

world_energy = energy[energy['country'] == 'World']

fig = px.line(world_energy[world_energy['energy_per_capita']!=0], 
        x="year", 
        y="energy_per_capita", 
        title="World Energy consumption per capita (1965-2019)",
        range_x=[1964, 2020],
        range_y= [10000, 24000], markers=True)
fig.update_layout(
    {
        'yaxis': {'title': 'kilowatt-hours (kWh)'}
    }
)
world_fuel = pd.melt(world_energy[['year','cement_co2', 'coal_co2', 'flaring_co2', 'gas_co2',
       'oil_co2', 'other_industry_co2']],id_vars=['year'],var_name='CO2_energy', value_name='value')
## convert to billions tonne 
world_fuel['value'] = world_fuel['value']/1000
fig = px.line(world_fuel[world_fuel['value']!= 0], 
        x="year", 
        y="value", 
        title="CO2 emissions by fuel type (1750-2019)",
        color = 'CO2_energy',labels = {'CO2_energy':'CO2 by fuel type'})
fig.update_layout(hovermode='x unified', yaxis_title= 'CO2 emissions (billion t)')
fig.show()

The production of energy of course contributes to CO2 emissions, click the play buttom below to see how much each fuel/energy type contribute to world’s CO2 emissions.

## find the total co2 emission from each energy type
world_energy['total_co2'] = world_energy.iloc[:,5:11].sum(axis=1)
## calculate share of co2 emission of each energy type
fuel_prc = world_energy.iloc[:,5:11].div(world_energy.total_co2,axis=0)*100
fuel_prc['year'] = list(world_energy.year)

years = fuel_prc.year
trace1 = go.Scatter(x=years,
                    y=fuel_prc['coal_co2'],
                    name = 'Coal',
                    mode='lines',
                    line=dict(width=1.5),
                    visible=True)
trace2 = go.Scatter(x = years,
                    y = fuel_prc['oil_co2'],
                    name = 'Oil',
                    mode='lines',
                    line=dict(width=1.5),
                    visible=True)
trace3 = go.Scatter(x = years,
                    y = fuel_prc['gas_co2'],
                    name = 'Gas',
                    mode='lines',
                    line=dict(width=1.5),
                    visible=True)
trace4 = go.Scatter(x = years,
                    y = fuel_prc['cement_co2'],
                    name = 'Cement',
                    mode='lines',
                    line=dict(width=1.5),
                    visible=True)
trace5 = go.Scatter(x = years,
                    y = fuel_prc['flaring_co2'],
                    name = 'Flaring',
                    mode='lines',
                    line=dict(width=1.5),
                    visible=True)
trace6 = go.Scatter(x = years,
                    y = fuel_prc['other_industry_co2'],
                    name = 'Other industry',
                    mode='lines',
                    line=dict(width=1.5),
                    visible=True)

frames = [dict(data= [dict(type='scatter',
                           x=fuel_prc['year'][:k+1],
                           y=fuel_prc['coal_co2'][:k+1]),
                      dict(type='scatter',
                           x=fuel_prc['year'][:k+1],
                           y=fuel_prc['oil_co2'][:k+1]),
                      dict(type='scatter',
                           x=fuel_prc['year'][:k+1],
                           y=fuel_prc['gas_co2'][:k+1]),
                      dict(type='scatter',
                           x=fuel_prc['year'][:k+1],
                           y=fuel_prc['cement_co2'][:k+1]),
                      dict(type='scatter',
                           x=fuel_prc['year'][:k+1],
                           y=fuel_prc['flaring_co2'][:k+1]),
                      dict(type='scatter',
                           x=fuel_prc['year'][:k+1],
                           y=fuel_prc['other_industry_co2'][:k+1])
                     ],
               traces= [0, 1, 2, 3, 4, 5],  
              )for k  in  range(1, len(fuel_prc)-1)]

layout = go.Layout(hovermode='x unified', updatemenus=[dict(type="buttons", direction="right", x=0.9, y=1.16), 
        ],
        xaxis=dict(range=[1749,2020],
                   autorange=False, tickwidth=2,
                   title_text="Year"),
        yaxis=dict(range=[-5, 105],
                   autorange=False,
                   title_text="%"),
        title="Percentage share of CO2 emissions by energy/fuel type (1750-2019)",
    )
                  
fig = go.Figure(data=[trace1, trace2, trace3, trace4, trace5, trace6], frames=frames, layout=layout)

fig.update_layout(
    updatemenus=[
        dict(
            buttons=list([
                dict(label="Play",
                        method="animate",
                    args=[None, 
                                  dict(frame=dict(duration=5, 
                                                  redraw=False),
                                                  transition=dict(duration=0),
                                                  fromcurrent=True,
                                                  mode='immediate')])
            ]))])
fig.show()

As we can see coal burning has dominated world’s energy generation from the early industrialisation years and still is the most used fuel type together with oil, thereby still the largest contributer to CO2 emissions. From early 1990s we start to see the rise of other energy sources, which includes renewable energy and there seems to be a steady growth.

5.3. Human acitivities and Wealth (GDP)

So we are using more energy than ever before, but for what? The answer is that we travel and transport more, we are manufacturing more goods, we construct more buildings, we further industrialise our productions.

Please play around with the pie chart to see how different kinds of human activity contribute a country, continent and the world’s CO2 emissions.

def countryname_to_continent(country_name):
    try:
        country_code = pc.country_name_to_country_alpha2(country_name, cn_name_format="default")
        country_continent_code = pc.country_alpha2_to_continent_code(country_code)
        country_continent_name = pc.convert_continent_code_to_continent_name(country_continent_code)
    except: 
        country_continent_name = country_name
        
    return country_continent_name

df_human = pd.read_csv('data/co2_humanactivities.csv',sep = ',',encoding = 'unicode_escape')
df_human['Continent'] = df_human['Country'].apply(lambda x: countryname_to_continent(x))
## filter out to keep only countries 
df_human = df_human[df_human['Continent'].isin(['Africa','Asia','Europe','North America','South America'])]
fig = px.sunburst(df_human, path=['Continent', 'Country','Sector'], values='2018', color='Continent',width=1300, height=800)
fig.update_traces(textinfo="label+percent entry ")
fig.update_layout(
    title = 'CO2 emissiosn by human activities',
)
fig.show()
---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
/var/folders/wv/j_lqm8yn4ndb1hvhthjcrb540000gn/T/ipykernel_19339/2666037783.py in <module>
      9     return country_continent_name
     10 
---> 11 df_human = pd.read_csv('data/co2_humanactivities.csv',sep = ',',encoding = 'unicode_escape')
     12 df_human['Continent'] = df_human['Country'].apply(lambda x: countryname_to_continent(x))
     13 ## filter out to keep only countries

/opt/homebrew/Caskroom/miniforge/base/envs/cforge_env/lib/python3.9/site-packages/pandas/util/_decorators.py in wrapper(*args, **kwargs)
    309                     stacklevel=stacklevel,
    310                 )
--> 311             return func(*args, **kwargs)
    312 
    313         return wrapper

/opt/homebrew/Caskroom/miniforge/base/envs/cforge_env/lib/python3.9/site-packages/pandas/io/parsers/readers.py in read_csv(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, skipfooter, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, cache_dates, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, doublequote, escapechar, comment, encoding, encoding_errors, dialect, error_bad_lines, warn_bad_lines, on_bad_lines, delim_whitespace, low_memory, memory_map, float_precision, storage_options)
    584     kwds.update(kwds_defaults)
    585 
--> 586     return _read(filepath_or_buffer, kwds)
    587 
    588 

/opt/homebrew/Caskroom/miniforge/base/envs/cforge_env/lib/python3.9/site-packages/pandas/io/parsers/readers.py in _read(filepath_or_buffer, kwds)
    480 
    481     # Create the parser.
--> 482     parser = TextFileReader(filepath_or_buffer, **kwds)
    483 
    484     if chunksize or iterator:

/opt/homebrew/Caskroom/miniforge/base/envs/cforge_env/lib/python3.9/site-packages/pandas/io/parsers/readers.py in __init__(self, f, engine, **kwds)
    809             self.options["has_index_names"] = kwds["has_index_names"]
    810 
--> 811         self._engine = self._make_engine(self.engine)
    812 
    813     def close(self):

/opt/homebrew/Caskroom/miniforge/base/envs/cforge_env/lib/python3.9/site-packages/pandas/io/parsers/readers.py in _make_engine(self, engine)
   1038             )
   1039         # error: Too many arguments for "ParserBase"
-> 1040         return mapping[engine](self.f, **self.options)  # type: ignore[call-arg]
   1041 
   1042     def _failover_to_python(self):

/opt/homebrew/Caskroom/miniforge/base/envs/cforge_env/lib/python3.9/site-packages/pandas/io/parsers/c_parser_wrapper.py in __init__(self, src, **kwds)
     49 
     50         # open handles
---> 51         self._open_handles(src, kwds)
     52         assert self.handles is not None
     53 

/opt/homebrew/Caskroom/miniforge/base/envs/cforge_env/lib/python3.9/site-packages/pandas/io/parsers/base_parser.py in _open_handles(self, src, kwds)
    220         Let the readers open IOHandles after they are done with their potential raises.
    221         """
--> 222         self.handles = get_handle(
    223             src,
    224             "r",

/opt/homebrew/Caskroom/miniforge/base/envs/cforge_env/lib/python3.9/site-packages/pandas/io/common.py in get_handle(path_or_buf, mode, encoding, compression, memory_map, is_text, errors, storage_options)
    699         if ioargs.encoding and "b" not in ioargs.mode:
    700             # Encoding
--> 701             handle = open(
    702                 handle,
    703                 ioargs.mode,

FileNotFoundError: [Errno 2] No such file or directory: 'data/co2_humanactivities.csv'

If we use more energy in order to meet higher demands of production also means that we will generate more wealth, hence growing the economy. It is shown that there are clear correlation between increasing wealth and CO2 emissions.

wealth = df_co2[['year', 'country','continent','co2', 'co2_per_capita','gdp','co2_per_gdp','population','share_global_co2']]
world_wealth = wealth[(wealth['country'] == 'World') & wealth['co2_per_gdp']]
world_wealth['co2'] = world_wealth['co2']/1000

fig = px.scatter(world_wealth, 
        x="gdp", 
        y="co2", 
        title="CO2 emissions vs. GDP (1965-2019)",trendline='ols')
fig.update_layout(yaxis_title= 'CO2 emissions (billion t)', xaxis_title= 'GDP ($)'
)

However, not everyone creates the same size of carbon footprint. We know that wealth is not distributed equally between us, some live in extreme poverty and some can get their hands on almost every resources possible. And this would also mean that we contribute differently to CO2 emissions. Let’s take a look at how CO2 emissions per person changes depending on how rich individuals are in a country in 2018.

## filter out non country rows
wealth_18 = wealth[(wealth['year'] == 2018) & (wealth['continent'].isin(['Asia','Europe','Africa','North America','Oceania','South America']))]
## calculate gdp per capita
wealth_18['gdp_per_capita'] = wealth_18['gdp']/wealth_18['population']
wealth_18['share_global_co2'] = wealth_18['share_global_co2']
## remove rows with missing values
wealth_18 = wealth_18[(wealth_18['gdp']!=0) & (wealth_18['share_global_co2']!=0)]

fig = px.scatter(wealth_18, x="gdp_per_capita", y="co2_per_capita",
	         size="population", color = 'continent',
                 hover_name="country", log_x=True, log_y=True, size_max=60,trendline='ols',trendline_scope="overall", title = 'CO2 per capita vs GDP per capita')
fig.update_layout(
    xaxis_title="GDP per capita ($)",
    yaxis_title="CO2 per capita (t)"
)
fig.show()
fig = px.treemap(wealth_18, path=[px.Constant("world"), 'continent', 'country'], values='co2_per_capita',
                  color='gdp_per_capita', hover_data=['co2'],
                  color_continuous_scale='RdBu',
                  color_continuous_midpoint=np.average(wealth_18['gdp_per_capita'], weights=wealth_18['co2_per_capita']))
fig.update_layout(margin = dict(t=50, l=25, r=25, b=25))
fig.show()

6. Who takes the responsibility?


7. Looking into the future

8. Discussion

References

[1] Carbon Dioxide | Vital Signs – Climate Change: Vital Signs of the Planet. (n.d.). Retrieved May 3, 2022, from https://climate.nasa.gov/vital-signs/carbon-dioxide/
[2] Ali, K. A., Ahmad, M. I., & Yusup, Y. (2020). Issues, Impacts, and Mitigations of Carbon Dioxide Emissions in the Building Sector. Sustainability 2020, Vol. 12, Page 7427, 12(18), 7427. https://doi.org/10.3390/SU12187427
[3] Effects | Facts – Climate Change: Vital Signs of the Planet. (n.d.). Retrieved May 3, 2022, from https://climate.nasa.gov/effects/
[4] Solomon, S., Plattner, G. K., Knutti, R., & Friedlingstein, P. (2009). Irreversible climate change due to carbon dioxide emissions. Proceedings of the National Academy of Sciences of the United States of America, 106(6), 1704–1709. https://doi.org/10.1073/PNAS.0812721106/SUPPL_FILE/0812721106SI.PDF
[5] The Paris Agreement | UNFCCC. (n.d.). Retrieved May 3, 2022, from https://unfccc.int/process-and-meetings/the-paris-agreement/the-paris-agreement
[6] Shi, A. (2001, August). Population growth and global carbon dioxide emissions. In IUSSP Conference in Brazil/session-s09.
[7] Weiss, K. R. (2015). Global greenhouse-gas emissions set to fall in 2015. Nature. https://doi.org/10.1038/NATURE.2015.18965

9. Contributions

  • Website creation and setup: Hanlu

  • Part 1: Introduction and Motivation: ****

  • Part 2: Dataset segmentation and cleaning: ****

  • Part 3: Descriptive statistics about the dataset: ****

  • Part 4: Keyword detection using TF-IDF: ****

  • Part 5: Network Analysis upon the Yelp hotel dataset: ****

  • Part 6: Topic detection using LDA: ****

  • Part 7: Sentiment Analysis upon the hotel reviews: ****

  • Part 8: Discussion: ****